Brief Announcement: Randomized Shared Queues

Authors

  • Hyunyoung Lee
  • Jennifer L. Welch
Abstract

This paper presents a specification of a randomized shared queue that can lose some elements or return them out of order (not in FIFO order), shows that the specification can be implemented over the probabilistic quorum algorithm of [4, 3], and analyzes the behavior of this implementation. Distributed algorithms that can tolerate some lost and out-of-order messages are candidates for replacing their message queues with random queues. The modified algorithms will inherit positive attributes concerning load and availability from the underlying queue implementation. The behavior of an application, a class of combinatorial optimization algorithms, when implemented using random queues is analyzed.

(A full version of this paper, which includes all proofs, is available as Technical Report TR01-004, Department of Computer Science, Texas A&M University, March 2001.)

1. SPECIFICATION OF RANDOM QUEUE

Queues are a fundamental concept in many areas of computer science. A common application in distributed computing is message queues in communication networks. Many distributed algorithms use high-level communication operations, such as scattering or all-to-all broadcasts (cf. Chapter 1 of [2] for an overview). These algorithms can typically tolerate inaccuracies in the order in which the queue returns its elements, as the order of the elements in the message queue is typically affected by the unpredictability of the communication network.

We define a random queue to be a randomized version of a shared queue in which some properties are relaxed, such that the number of enqueued data items is not necessarily preserved and the items can be dequeued out of order (not in FIFO order). A queue Q shared by several processes supports two operations, Enq(Q, v) and Deq(Q, v). Enq_i(Q, v) is the invocation by process i to enqueue the value v, Ack_i(Q) is the response to i's enqueue invocation, Deq_i(Q, v) is the invocation by i of a dequeue operation, and Ret_i(Q, v) is the response to i's dequeue invocation, which returns the value v. A possible return value is also ⊥, indicating an empty queue. The set of values from which v is drawn is unconstrained. We will focus on multi-enqueuer, single-dequeuer queues; thus, the enqueue can be invoked by all the processes, while the dequeue can be invoked by only one process.

Given a real number p between 0 and 1, a system is said to implement a p-random queue if the following conditions hold. (Liveness) Every operation invocation has a following matching response. (Integrity) Every operation response has a preceding matching invocation. (No Duplicates) For each value x, Deq(Q, x) occurs at most once. (Per Process Ordering) For all i, if Enq_i(Q, x1) ends before Enq_i(Q, x2) begins, then x2 is not dequeued before x1 is dequeued. (Probabilistic No Loss) For every enqueued value x, Pr[x is dequeued] ≥ p.
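As an illustration only (none of the following Python names appear in the paper), the specification can be read as the interface sketched below. Here losses are modeled crudely at enqueue time; in the quorum-based implementation of Section 2 an element is lost only when quorums fail to intersect.

```python
import random
from collections import deque

BOTTOM = None  # stands in for the "empty" return value ⊥


class PRandomQueue:
    """Toy model of a p-random queue (multi-enqueuer, single-dequeuer).

    Each enqueued item survives independently with probability p
    (Probabilistic No Loss), items from the same enqueuer keep their
    relative order (Per Process Ordering), and nothing is returned twice
    (No Duplicates). This is only an illustration of the specification,
    not the quorum-based implementation described in Section 2.
    """

    def __init__(self, p, num_enqueuers):
        self.p = p
        self.queues = [deque() for _ in range(num_enqueuers)]  # one FIFO per enqueuer
        self.next_queue = 0                                    # round-robin pointer

    def enq(self, i, v):
        # Process i enqueues v; with probability 1 - p the item is "lost".
        if random.random() < self.p:
            self.queues[i].append(v)

    def deq(self):
        # Single dequeuer: poll the per-enqueuer queues round-robin.
        for _ in range(len(self.queues)):
            q = self.queues[self.next_queue]
            self.next_queue = (self.next_queue + 1) % len(self.queues)
            if q:
                return q.popleft()
        return BOTTOM  # every queue looked empty


if __name__ == "__main__":
    rq = PRandomQueue(p=0.9, num_enqueuers=2)
    for v in range(5):
        rq.enq(v % 2, v)
    print([rq.deq() for _ in range(6)])
```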
2. IMPLEMENTATION OF RANDOM QUEUE

We now describe an implementation of a p-random queue. The next section computes the value of p, assuming that the application program using the shared queue satisfies certain properties. The queue algorithm is based on the probabilistic quorum algorithm of Malkhi et al. [4]. There are r replicated memory servers. We begin by describing a random queue for the special case of a single enqueuer. The case of n ≥ 1 enqueuers is implemented over a collection of n single-enqueuer queues.

The enqueue operation (Enq) mirrors the probabilistic quorum write operation: the local timestamp is incremented by one and attached to the element that is to be enqueued. The resulting pair is sent to the replicas in the chosen quorum, a randomly chosen group of k servers.

The key notion in the dequeue operation (SingleDeq) is a timestamp limit (T). At any given time, all timestamps that are smaller than the current value T are considered to be outdated. T is included in the dequeue messages to the replica servers and allows them to discard all outdated values. Beyond this, SingleDeq mirrors the probabilistic quorum read operation: the client selects a random quorum, sends dequeue messages to all replica servers in the quorum, and selects the response with the smallest timestamp t_d. It updates the timestamp limit to T := t_d + 1 and returns the element that corresponds to t_d.

Each replica server implements a conventional queue with access operations enqueue and dequeue. In addition, the dequeue operation receives the current timestamp limit as input and discards all outdated values (e.g., by means of repeated dequeue operations). The purpose of this is to ensure that there are exactly k replica servers that will return the element v_T with timestamp T in response to a dequeue request. Thus, the probability of finding this element (in the current dequeue operation) is exactly the probability that two quorums intersect. This property is of critical importance in the analysis in the following section. It does not hold if outdated values are allowed to remain in the replica queues, as those values could be returned instead of v_T by some of the replica servers containing v_T.

For the case of n ≥ 1 enqueuers, we extend the single-enqueuer, single-dequeuer queue to an n-enqueuer, single-dequeuer queue by having n copies of the single-enqueuer queue, i.e., n single-enqueuer queues (Q_1, ..., Q_n), one per enqueuer. The i-th enqueuer (1 ≤ i ≤ n) enqueues to Q_i. The single dequeuer dequeues from all n queues by making calls to the function Deq(), which selects one of the queues and tries to dequeue from it. Deq() checks the next queue in sequence. The round-robin sequence can be replaced by any other queue selection criterion that queries all queues with approximately the same frequency. The selection criterion will impact the order in which elements from the different queues are returned. However, it does not impact the probability of any given element being dequeued (eventually), as the queues do not affect each other, and an attempt to dequeue from an empty queue does not change its state.
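The construction can be sketched in a few dozen lines of Python. This is our own single-process simulation, not the paper's message-passing algorithm: the class and method names (ReplicaServer, SingleEnqueuerQueue, RandomQueue), the use of random.sample to pick a quorum, and the omission of real communication and failures are all assumptions made for illustration.

```python
import random


class ReplicaServer:
    """One of the r replicated memory servers; holds a queue of (timestamp, value) pairs."""

    def __init__(self):
        self.items = []  # kept sorted by timestamp for simplicity

    def enqueue(self, ts, value):
        self.items.append((ts, value))
        self.items.sort()

    def dequeue(self, limit):
        # Discard outdated entries (timestamp < limit), then return the oldest remaining pair.
        self.items = [(ts, v) for (ts, v) in self.items if ts >= limit]
        return self.items[0] if self.items else None


class SingleEnqueuerQueue:
    """Single-enqueuer, single-dequeuer random queue over r replicas with quorums of size k."""

    def __init__(self, r, k):
        self.replicas = [ReplicaServer() for _ in range(r)]
        self.k = k
        self.ts = 0       # enqueuer's local timestamp
        self.limit = 1    # dequeuer's timestamp limit T

    def enq(self, value):
        # Probabilistic-quorum write: attach a fresh timestamp and send to a random quorum.
        self.ts += 1
        for server in random.sample(self.replicas, self.k):
            server.enqueue(self.ts, value)

    def single_deq(self):
        # Probabilistic-quorum read: query a random quorum, keep the smallest timestamp seen.
        responses = [s.dequeue(self.limit) for s in random.sample(self.replicas, self.k)]
        responses = [resp for resp in responses if resp is not None]
        if not responses:
            return None  # looked empty (or the oldest element was missed by this quorum)
        ts, value = min(responses, key=lambda pair: pair[0])
        self.limit = ts + 1
        return value


class RandomQueue:
    """n-enqueuer, single-dequeuer queue: one single-enqueuer queue per enqueuer, polled round-robin."""

    def __init__(self, n, r, k):
        self.queues = [SingleEnqueuerQueue(r, k) for _ in range(n)]
        self.next_q = 0

    def enq(self, i, value):
        self.queues[i].enq(value)

    def deq(self):
        q = self.queues[self.next_q]
        self.next_q = (self.next_q + 1) % len(self.queues)
        return q.single_deq()
```

In this sketch an element can be missed only when the dequeue quorum fails to intersect the quorum its enqueue wrote to; that intersection event is exactly what the next section analyzes.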
3. ANALYSIS OF RANDOM QUEUE IMPLEMENTATION

For this analysis, we assume that the application program invoking the operations on the shared random queue satisfies a certain property. Every complete execution consists of a sequence of segments. Each segment is a sequence of enqueues followed by a sequence of dequeues, which has at least as many dequeues as enqueues. Fix a segment. Let m_e, resp. m_d, be the total number of enqueue, resp. dequeue, operations in this segment. Let m = m_e + m_d. Let Y_i be the indicator random variable for the event that the i-th element is returned by a dequeue operation (1 ≤ i ≤ m_e). In the following lemma, the probability space is given by the enqueue and dequeue quorums which are selected by the queue access operations. More precisely, let P_k(r) denote the collection of all subsets of size k of the set {1, ..., r}. Since there are m enqueue and dequeue operations, we let Ω = P_k(r)^m be the universe. The probability space for the following lemma is given by the finite universe Ω and the uniform distribution on Ω.

Lemma 1. The random variables Y_i (1 ≤ i ≤ m_e) are mutually independent and identically distributed with

$$\Pr(Y_i = 1) = p = 1 - \binom{r-k}{k} \bigg/ \binom{r}{k}.$$

Theorem 1. The algorithm in Section 2 implements a random queue.
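Lemma 1 gives p in closed form, so it can be evaluated directly; the small helper below (ours, not the paper's) uses Python's math.comb for a few example values of r and k.

```python
from math import comb


def quorum_intersection_prob(r, k):
    """Probability that two independently chosen k-subsets of r servers intersect: 1 - C(r-k, k) / C(r, k)."""
    return 1 - comb(r - k, k) / comb(r, k)


if __name__ == "__main__":
    for r, k in [(16, 4), (64, 12), (100, 20)]:
        print(r, k, round(quorum_intersection_prob(r, k), 4))
```

As expected, p approaches 1 as the quorum size k grows relative to the number of replicas r.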
4. APPLICATION OF RANDOM QUEUE: GO WITH THE WINNERS

In this section we show how to incorporate random queues to implement a generic randomized optimization algorithm called Go with the Winners (GWTW), which was proposed by Aldous and Vazirani [1]. We analyze how the weaker consistency provided by random queues affects the success probability of the GWTW algorithm. Our goal is to show that the success probability is not significantly reduced.

A combinatorial optimization problem is given by a state space S (typically exponentially large) and an objective function f, which assigns a 'quality' value to each state. The task is to find a state s ∈ S which maximizes (or minimizes) f(s). It is often sufficient to find approximate solutions. For example, in the case of the clique problem, S can be the set of all cliques in a given graph and f(s) can be the size of clique s.

In order to apply GWTW to an optimization problem, the state space has to be organized in the form of a tree or a DAG, such that the following conditions are met: (a) the single root is known; (b) given a node s, it is easy to determine if s is a leaf node; (c) given a node s, it is easy to find all child nodes of s. The parent-child relationship is entirely problem-dependent, given that f(child) is better than f(parent). For example, when applied to the clique problem on a graph G, there will be one node for each clique. The empty clique is the root. The child nodes of a clique s of size k are all the cliques of size k + 1 that contain s. Thus, the nodes at depth i are exactly the i-cliques. The resulting structure is a DAG. We could have defined a tree by considering ordered sequences of vertices.

Greedy algorithms, when formulated in the tree model, typically start at the root node and walk down the tree until they reach a leaf. The GWTW algorithm follows the same strategy, but tries to avoid leaf nodes with poor values of f by doing several runs of the algorithm simultaneously, in order to bound the running time and boost the success probability (success means a node is found with a sufficiently good value of f). We call each of these runs a particle, which carries with it its current location in the tree and moves down the tree until it reaches a leaf node. The algorithm works in synchronous stages. During the k-th stage, the particles move from depth k to depth k+1. Each particle in a non-leaf node is moved to a randomly chosen child node. Particles in leaf nodes are removed. To compensate for the removed particles, an appropriate number of copies of each of the remaining particles is added. The main theme to achieve a certain constant probability of success is to try to keep the total number of particles at each stage close to the constant B.

The framework of the GWTW algorithms is as follows: at stage 0, start with B particles at the root. Repeat the following procedure until all the particles are at leaves: at stage i, remove the particles at leaf nodes, and for each particle at a non-leaf node v, add at v a random number of particles, this random number having some specified distribution. Then, move each particle from its current position to a child chosen at random.
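A minimal sketch of this sequential framework is given below. It is our own reading, not the paper's Algorithm 2: the tree is abstracted by caller-supplied functions children and is_leaf, and the reproduction distribution is passed in as reproduce.

```python
import random


def gwtw(root, children, is_leaf, B, reproduce):
    """Go with the Winners: B particles walk down the tree in synchronous stages.

    root      -- the root node of the search tree/DAG
    children  -- function node -> list of child nodes (non-empty for non-leaf nodes)
    is_leaf   -- function node -> bool
    B         -- initial number of particles
    reproduce -- function () -> non-negative int, the random number of copies
                 added per surviving particle at each stage
    Returns the list of leaf nodes reached by the particles.
    """
    particles = [root] * B
    reached_leaves = []
    while particles:
        # Particles at leaf nodes are removed (recorded here for inspection).
        reached_leaves += [v for v in particles if is_leaf(v)]
        alive = [v for v in particles if not is_leaf(v)]
        # Reproduction: add a random number of copies of each surviving particle.
        alive = [v for v in alive for _ in range(1 + reproduce())]
        # Move every particle to a randomly chosen child.
        particles = [random.choice(children(v)) for v in alive]
    return reached_leaves
```

For the clique example above, children(s) would enumerate the cliques of size |s| + 1 containing s, and is_leaf(s) would test whether no such larger clique exists.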
We consider a distributed version of the GWTW framework, presented as Algorithm 2 in the full paper. Consider an execution of Algorithm 2 on n processes. At the beginning of the algorithm (stage 0), the B particles are evenly distributed among the n processes. Since, at the end of each stage, some particles may be removed and some particles may be added, the processes need to communicate with each other to perform load balancing of the particles (global exchange). We use shared-memory communication among the processes. In particular, we use shared queues to distribute the particles among the processes.

When using random queues, the errors will affect GWTW, since some particles disappear with some probability. However, we show that this does not affect the performance of the algorithm significantly. In particular, we estimate how the disappearance of particles caused by the random queue affects the success probability of GWTW. We now show that Algorithm 2, when implemented with random queues, works as well as the original algorithms in [1].

We use the notation of [1] for the original GWTW algorithm (in which no particles are lost by random queues). Let X_v be a random variable denoting the number of particles at a given vertex v. Let S_i be the number of particles at the start of stage i. At stage 0, we start with B particles. Then S_0 = B and $S_i = \sum_{v \in V_i} X_v$ for i > 0, where V_i is the set of all vertices at depth i. Let p(v) be the chance that the particle visits vertex v. Then $a(j) = \sum_{v \in V_j} p(v)$ is the chance that the particle reaches depth at least j. p(w|v) is defined to be the chance that the particle visits vertex w conditioned on it visiting vertex v, and a(j|v) is the corresponding chance of reaching depth j conditioned on visiting v. The values s_i, 1 ≤ i < ℓ, are constants which govern the particle reproduction rate of GWTW. The parameter κ is defined to express the "imbalance" of the tree as follows: for i < j,

$$\kappa_{ij} = \frac{a(i)}{a^2(j)} \sum_{v \in V_i} p(v)\, a^2(j \mid v), \qquad \kappa = \max_{0 \le i < j \le d} \kappa_{ij}.$$

Aldous and Vazirani [1] prove:

Lemma 2.
$$E S_i = \frac{B\, a(i)}{s_i}, \quad 0 \le i \le d, \qquad \text{and} \qquad \mathrm{var}\, S_i \le \kappa B\, \frac{a^2(i)}{s_i^2} \sum_{j=0}^{i} \frac{s_j}{a(j)}, \quad 0 \le i \le d.$$

We will use this lemma to prove similar bounds for the distributed version of the algorithm, in which errors in the queues can affect particles. For this purpose, we formulate the effect of the random queues in the GWTW framework. More precisely, given any original GWTW tree T, we define a modified tree T′ which accounts for the effect of the random queues. Given a GWTW tree T, let T′ be defined as follows: for every vertex in T, there is a vertex in T′; for every edge in T, there is a corresponding edge in T′. In addition to the basic tree structure of T, each non-leaf node v of T has an additional child w in T′. This child w is a leaf node. The purpose of the additional leaf nodes is to account for the probability with which particles can disappear in the random queues in Algorithm 2.

Given any node w in T′ (which is not the root) and its parent v, let p′(w|v) denote the probability of moving to w conditional on being in v. For the additional leaf nodes w in T′, we set p′(w|v) = 1 − p, where 1 − p is the probability that a given particle is lost in the queue. For all other pairs (w, v), let p′(w|v) = p · p(w|v). Then a′(i), a′(i|v), S′_i, s′_i, X′_v, and κ′ can be defined similarly for T′.

Given a vertex v of T, let p(v) denote the probability that Algorithm 2, when run with a single particle and without reproduction, reaches vertex v. The term "without reproduction" means that the distribution mentioned in the first "for" loop of the algorithm is such that the number of added particles is always zero. The main property of the construction of T′ is:

Fact 1. For any vertex v of the original tree T, p′(v) = p(v). Furthermore, Pr(Algorithm 2 reaches depth ℓ) = p · Pr(GWTW on T′ reaches depth ℓ) for any ℓ ≥ 0.

We can now analyze the success probability of Algorithm 2 (a combination of GWTW and random queues) by means of analyzing the success probability of baseline GWTW on a slightly modified tree. This allows us to use the results of [1] in our analysis. In particular:

Lemma 3.
$$E S'_i = \frac{B'\, p^{i-1} a(i)}{s'_i}, \quad 0 \le i \le d, \qquad \text{and} \qquad \mathrm{var}\, S'_i \le \frac{1}{p}\, \kappa B'\, \frac{\bigl(p^{i-1} a(i)\bigr)^2}{s_i'^2} \sum_{j=0}^{i} \frac{s'_j}{p^{j-1} a(j)}, \quad 0 \le i \le d.$$

In order to allow a direct comparison between the bounds of Lemmas 2 and 3, it is necessary to relate the constants (s_i), 1 ≤ i < ℓ, and (s′_i), 1 ≤ i < ℓ. These constants govern the particle reproduction rate of GWTW and can either be set externally or determined by a sampling procedure described in [1]. If we set s′_i = p^{i−1} s_i, then the expectations of Lemmas 2 and 3 are equal and the variance bounds are within a factor of p of each other. The variance bound is used in [1] in connection with Chebyshev's inequality to provide a lower bound on the success probability of GWTW. It follows that the negative effect of random queues on the GWTW variance bounds can be compensated for by increasing the number B of particles at the root by a factor of 1/p.
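This comparison can be sanity-checked numerically. The fragment below is our own illustration with arbitrary example values for B, p, κ, a(i), and s_i (and taking B′ = B); it confirms that under s′_i = p^{i−1} s_i the expectations of Lemmas 2 and 3 coincide and the variance bounds differ by exactly the factor 1/p.

```python
# Numeric check of the Lemma 2 / Lemma 3 comparison, with made-up example values.
B, p, kappa, d = 1000, 0.9, 2.0, 5
a = [0.8 ** i for i in range(d + 1)]                   # example depth-survival probabilities a(i)
s = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]                     # example reproduction constants s_i
s_prime = [p ** (i - 1) * s[i] for i in range(d + 1)]  # the substitution s'_i = p^(i-1) s_i

for i in range(d + 1):
    exp_orig = B * a[i] / s[i]                          # E S_i   (Lemma 2)
    exp_rand = B * p ** (i - 1) * a[i] / s_prime[i]     # E S'_i  (Lemma 3), with B' = B
    var_orig = kappa * B * a[i] ** 2 / s[i] ** 2 * sum(s[j] / a[j] for j in range(i + 1))
    var_rand = (kappa * B / p) * (p ** (i - 1) * a[i]) ** 2 / s_prime[i] ** 2 * sum(
        s_prime[j] / (p ** (j - 1) * a[j]) for j in range(i + 1))
    assert abs(exp_orig - exp_rand) < 1e-9              # expectations coincide
    assert abs(var_rand - var_orig / p) < 1e-6          # variance bounds differ by the factor 1/p
print("Lemma 2 vs. Lemma 3 check passed")
```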

5. REFERENCES

[1] D. Aldous and U. Vazirani. "Go With the Winners" Algorithms. In Proc. of the 35th IEEE Symposium on Foundations of Computer Science, pp. 492-501, 1994.
[2] D. Bertsekas and J. Tsitsiklis. Parallel and Distributed Computation. Prentice-Hall, Englewood Cliffs, NJ, 1989.
[3] D. Malkhi and M. Reiter. Byzantine Quorum Systems. In Proc. of the 29th ACM Symposium on Theory of Computing, pp. 569-578, May 1997.
[4] D. Malkhi, M. Reiter, and R. Wright. Probabilistic Quorum Systems. In Proc. of the 16th Annual ACM Symposium on Principles of Distributed Computing, pp. 267-273, Aug. 1997.